Development Note
December 18, 2001
Paul S. Prueitt, PhD
Founder (2001), OntologyStream Inc.
OSI is actively searching for investment and clients.
End-of-year resources are very short. Meanwhile, development of the software continues, primarily through discussions between Cameron Jones, Paul Prueitt and Don Mitchell. Several side discussions are under way, including discussions with Dr. Richard Ballard on ontology encoders. The eventChemistry e-forum is open and is supporting additional discussions.
It is important to continue the basic development of the technology while we wait for positive results from the search for additional economic support. All we can do is develop the historical record and find closure for those things that are keeping investment from occurring.
The development of the market is a somewhat separate process. Identifying clients requires real presentations of what the software will do for the client. At least temporarily, we are not looking for a client in the computer intrusion domain, in deference to our recent client. It is hoped that after the first of the year management will wake up and restart the development and deployment process.
However, we must develop demonstrations for the other markets and figure out ways of making the case for our technology in each of the following markets.
More will be discussed on the marketing issue as this Development Note is written.
Section 1: On the issue of resolution
Up to now, we have not talked a great deal about the pre-processor that develops “datawh.txt” files for thematic analysis. Some theoretical discussion is given in the First Report on eventChemistry™ of December 17. However, that discussion is not as detailed as we need it to be.
Thematic analysis will be taken up again soon. But we address a different issue now. This is the issue of compound resolutions.
Look at the data set in smallFables.zip.
Figure 1 (panels a and b)
A reduced set of tokens is used to develop a small datawh.txt. The category A1 has only 54 atoms. The reduced set of tokens also reduces the linkage between fables and shows a separation of prime categories at the A1 level.
Figure 1b is the event scatter for the category B2.
The six atoms of category B2 can be fully resolved into a situated compound such as the one shown in Figure 2:
Figure 2: Situated Compound for category B2
This resolution is done by hand, as is the resolution of Figure 3a into Figure 3b. In both cases, the resolution is unique except for the angles made between the links.
A diffusion process that drives the links toward uniform length and the atoms toward equal spacing will tighten the picture into a characteristic pattern. But these two processes sometimes conflict, and sometimes there are not sufficient constraints to impose a unique distribution.
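The tightening process described above can be illustrated with a small relaxation loop. This is only a sketch under stated assumptions, not the algorithm used in the Event Browser: it works in 2-D, treats each link as a spring with a common rest length, and adds a pairwise repulsion so that atoms spread out. The function names `relax` and `link_length` and all parameter values are hypothetical.

```python
import math


def relax(positions, links, rest_length=1.0, repulsion=0.5,
          rate=0.05, max_step=0.1, steps=500):
    """Return new 2-D positions after iterative relaxation (hypothetical sketch).

    positions: dict mapping atom name -> (x, y)
    links:     list of (atom, atom) pairs
    """
    pos = {name: list(xy) for name, xy in positions.items()}
    names = list(pos)
    for _ in range(steps):
        force = {name: [0.0, 0.0] for name in names}
        # Spring term: pulls every link toward the common rest length.
        for a, b in links:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = (dist - rest_length) / dist
            force[a][0] += pull * dx
            force[a][1] += pull * dy
            force[b][0] -= pull * dx
            force[b][1] -= pull * dy
        # Repulsion term: pushes every pair of atoms apart (inverse square).
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 3
                force[a][0] -= push * dx
                force[a][1] -= push * dy
                force[b][0] += push * dx
                force[b][1] += push * dy
        # Move each atom along its net force, capping the step for stability.
        for name in names:
            fx, fy = force[name]
            mag = math.hypot(fx, fy)
            scale = min(rate, max_step / mag) if mag > 0 else 0.0
            pos[name][0] += scale * fx
            pos[name][1] += scale * fy
    return pos


def link_length(pos, a, b):
    """Euclidean length of the link between atoms a and b."""
    return math.hypot(pos[b][0] - pos[a][0], pos[b][1] - pos[a][1])
```

The two terms pull against each other, which mirrors the conflict noted above: the links settle at a common length that is somewhat longer than `rest_length`, and when the bag is weakly linked the final angles depend on the starting scatter.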
Figure 3 (panels a and b): Event chemistry from IDS data
These remarks are general in nature, but revealing of the technical issues related to imposing a set of rules on the evolution of a resolution of a scattering of atoms.
What does one do algorithmically to cause something like the transition from Figure 3a to Figure 3b? Given that such an algorithm exists, will it reasonably transform Figure 4 into a single resolution?
Figure 4: A more complex bag of atoms from the fable collection.
Mitchell’s code will resolve any bag of (SLIP) atoms into a compound with a specific link structure, as reflected in the real link analysis developed by the SLIP Analytic Conjecture. Of course, right now this makes the most sense if we have an event log with columns.
The columns need to be reified as tokens, so if the data were numerical then perhaps a fuzzification technique would make the conversion. If the columns are values from the nodes of a finite state machine, then the class of natural kinds is equal to the types of nodes in a small finite state machine. The I-Ching provides an example of a finite state machine with 64 nodes. KOS is designed to control the transitions of small finite state machines that are linked to a machine ontology, such as what Dr. Ballard has available in his Mark 3 ontology encoder.
If the columns are text elements, then we bin the text using a parse program and use the unique values as the nodes of a finite state machine.
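The two conversions just described (numerical columns through fuzzification, text columns through parsing into unique values) can be sketched as follows. This is an illustration only: the triangular membership functions, the set boundaries in `FUZZY_SETS`, the assumed 0 to 100 value range, and the function names are all assumptions made for the example, not part of the existing pre-processor.

```python
import re

# Hypothetical fuzzy sets for a numeric column whose values lie in 0..100.
# Each entry is (left, peak, right) of a triangular membership function.
FUZZY_SETS = {
    "low": (-1, 0, 50),
    "medium": (0, 50, 100),
    "high": (50, 100, 101),
}


def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set: 1.0 at peak, 0.0 outside."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def numeric_token(column, value):
    """Reify a number as a token named after its strongest fuzzy set."""
    label = max(FUZZY_SETS, key=lambda name: triangular(value, *FUZZY_SETS[name]))
    return f"{column}:{label}"


def text_tokens(text):
    """Bin a text element into its sorted unique lowercase words."""
    return sorted(set(re.findall(r"[a-z]+", text.lower())))


def reify_row(row):
    """Turn one event-log row (dict of column -> value) into a list of tokens."""
    tokens = []
    for column, value in row.items():
        if isinstance(value, (int, float)):
            tokens.append(numeric_token(column, value))
        else:
            tokens.extend(f"{column}:{word}" for word in text_tokens(str(value)))
    return tokens
```

For example, `reify_row({"bytes": 80, "note": "The fox and the crow"})` yields one token for the numeric column and one token per unique word in the text column. Each distinct token can then serve as a node type in the sense used above.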
There are some topological issues related to nearness being interpreted as closeness in meaning. Of course this is the major problem we have in general with almost any technique in image understanding, text understanding or event log understanding. Part of the responsibility of human visual acuity and decision-making is to increase the constraints on an underconstrained (stochastic) aggregation process. This human touch encodes relationships of interest.
The use of human visual acuity is suggested first (at least in my work) in my report to the Army Research Lab on Alex Zenkin’s cognitive visualization. Dr. Prueitt has also shared with Dr. Jones the work on the use of fractal decomposition and fractal error.
Any new coding structure should be able to expect a standard object model.
The Event Browser will understand the object model, and the data for a specific bag of atoms will be stored in a text file. The Object Model is then used conceptually as well as programmatically to transform the bag of atoms into a compound, just as a chemical factory in a plant composes atoms into chemical compounds.
The object model has:
1) The objectSpace, which is now 3-D and infinite in span. However, the size of the Event Browser window creates a boundary in which the bag of atoms is scattered and within which all transformations are constrained.
2) Atoms and links are objects or constructs. All atoms and all links for a bag are computed when the Event Browser is pointed at a node in the SLIP Framework. This information is stored in an ASCII file and the data is placed into process memory. The ASCII file is made available to any other computer process immediately after the data is placed into memory.
3) Immediately after scattering the bag of atoms (from one node in the SLIP Framework) into objectSpace, each individual atom will be subsumed as a compound object with one atom. The location of the compound and the compound valences are acquired from the atoms contained in the compound object.
4) The dissipative/escapement algorithms are run iteratively to produce a process compartment having structure and stability.
5) Compounds are merged until the bag of atoms is fully resolved. As this process develops, there are fewer and fewer compounds. The aggregation process stops when the atom valences are fully resolved, just as in physical chemistry.
6) Also just as in physical chemistry, there are sometimes meta-stable states that move the compound dynamics back and forth between two or more “situated resolutions”.
7) Because the bag may have atoms whose resolution is only possible with atoms outside the bag, the fully resolved compound is relative to the bag, and may have valence (linkage) to compounds associated with other nodes in the SLIP Framework.
At any time, the entire state produced by the aggregation process is recorded in the various objects in memory. This information can be written out to a text file.
Thus Mathematica or any other process can take up the transformation of the bag of atoms at any point during the formative process. The interface that allows one to move the bag over to a Mathematica visualization environment is straightforward. This is the perfect way to develop different types of chemistry. Once a chemistry is developed, the transformation can be encoded as a Referential Information Base (lines and tensors, as in the scatter-gather to the circle). This provides very fast processing.
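The merge steps of the object model can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Mitchell’s code: a union-find structure stands in for the iterative dissipative/escapement process, every atom starts as its own compound, compounds joined by a link inside the bag are merged, and links that reach atoms outside the bag are kept as open valences. The names `resolve` and `dump_state` and the text layout of the dump are hypothetical.

```python
def resolve(bag, links):
    """Merge single-atom compounds along links (hypothetical sketch).

    bag:   list of atom names scattered from one node
    links: list of (atom, atom) pairs; either end may lie outside the bag
    Returns (compounds, open_valences).
    """
    # Each atom starts as its own compound (item 3 of the object model).
    compound_of = {atom: atom for atom in bag}

    def find(atom):
        # Follow parent pointers to the compound's representative atom.
        while compound_of[atom] != atom:
            compound_of[atom] = compound_of[compound_of[atom]]
            atom = compound_of[atom]
        return atom

    open_valences = []
    for a, b in links:
        if a in compound_of and b in compound_of:
            # Both ends are in the bag: merge their compounds (item 5).
            compound_of[find(a)] = find(b)
        else:
            # At least one end is outside the bag: valence stays open (item 7).
            open_valences.append((a, b))

    compounds = {}
    for atom in bag:
        compounds.setdefault(find(atom), []).append(atom)
    return sorted(sorted(members) for members in compounds.values()), open_valences


def dump_state(compounds, open_valences):
    """Write the current state as plain text that another process can read."""
    lines = [f"compound {i}: {' '.join(members)}"
             for i, members in enumerate(compounds)]
    lines += [f"open {a} {b}" for a, b in open_valences]
    return "\n".join(lines)
```

Because the dump is plain text, a separate visualization process can read it at any stage. The sketch does not model the meta-stable states of item 6, since a union-find merge always reaches a single final grouping.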
Figure 5: The SenseMaking environment for SLIP
Figure 5 can be regarded as an IDEF type diagram with the following inputs or outputs (starting in the upper right corner):
1) Event Log: The data source has to be in the form of an event log. The event source can be a text mining process or a data mining process that discovers event profiles and places them into the record. Examples so far are (a) Intrusion Detection audit logs, (b) an FTP log file dump, and (c) a crude thematic parser as applied to the BCNGroup Fable collection. Human work is required to identify proper event logs.
2) Human Judgment: The development of the Analytic Conjecture is aided by the SLIP warehouse Browser (see exercises).
3) Human Control of Categorization: Human control can be exercised at either of the two levels of abstraction (see section 5.2 of the First Report on eventChemistry).
4) Human Intervention: Actions are taken due to information delivered to the Invariance Detection Cycle.
5) Human Control over relevance metrics: Reinforcement learning, such as found in artificial neural network classifiers, can be used in conjunction with manual changes in the way the overall system is working.
This type of SenseMaking system is being considered for implementation as an incident management system.